epoching: more test infra and some fuzz tests on keeper functionalities #60
Conversation
f.Add(int64(11111))
f.Add(int64(22222))
f.Add(int64(55555))
f.Add(int64(12312))
Why bother with these? I thought that when something fails, the fuzzer saves the failing input somewhere so it can act as a regression test.
This is indeed the case when we run fuzz tests individually. However, CI will only run the test cases added by f.Add, without trying other random values generated from the corpus. Would a larger seed corpus be beneficial for CI? If not, I will keep only one of them.
I see, so you want it to run at least a few times. We discussed with @vitsalis that we could add -fuzztime to the CI to run it for a limited number of iterations.
Not sure what you guys think about doing something like f.Add(time.Now().UnixMilli()), so the tests run as unit tests are not always the same and you aren't forced to include this arbitrary-looking boilerplate every time. It would ensure each target runs at least once, and if it fails then hopefully it prints the troublesome seed as well.
The counter-argument was that it makes builds non-deterministic. It can be annoying if random tests fail on some runs and not others, but over a long period of time this would catch more bugs than relying on the same seeds. We could even do something like addRandomSeeds(f, 100) to add exactly 100 random seeds, which is what other frameworks do (i.e. they achieve 100 passes per test by default, rather than running forever like Go).
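A minimal sketch of what such a helper could look like (the names addRandomSeeds and randomSeeds are hypothetical, not from this PR); it also folds in the time-based seed idea from above, while logging the base seed so a failure stays reproducible:

```go
package main

import (
	"math/rand"
	"testing"
	"time"
)

// randomSeeds returns n pseudo-random int64 values derived from one base seed,
// so a failing run can be reproduced from just that base seed.
func randomSeeds(n int, base int64) []int64 {
	r := rand.New(rand.NewSource(base))
	out := make([]int64, n)
	for i := range out {
		out[i] = r.Int63()
	}
	return out
}

// addRandomSeeds adds n random entries to the seed corpus, so a plain
// `go test` run (as in CI) exercises the target n times rather than once
// per hand-written f.Add call.
func addRandomSeeds(f *testing.F, n int) {
	base := time.Now().UnixMilli()
	f.Logf("fuzz base seed: %d", base) // log the seed so failures are reproducible
	for _, s := range randomSeeds(n, base) {
		f.Add(s)
	}
}
```

Deriving all n seeds from one logged base seed keeps the non-determinism debuggable: rerunning with the same base reproduces the exact corpus.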
> we could add -fuzztime to the CI to run it for a limited amount of iterations.

Are you aware of any projects using such a strategy? I feel this requires a very powerful VM to run the CI. My CPU got very hot when executing fuzz tests, and sometimes a fuzz test takes >10s just to generate the initial corpus.

> We could even do something like addRandomSeeds(f, 100) to add exactly 100 random seeds, which is what other frameworks are doing (ie. they achieve 100 passes by default for each test, rather than run forever like Go).

This is a good idea: basically, we limit the number of tries (rather than the running time) of the fuzz tests. I will look into this approach and find a way to try it in subsequent PRs. Hopefully the CI will be happy to do this.
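For reference, a bounded CI step might look like the fragment below (hypothetical workflow step and target name; note that Go's -fuzztime flag accepts both a wall-clock duration like 30s and an iteration count like 100x, so the "100 passes" convention maps directly to -fuzztime=100x):

```yaml
# Hypothetical GitHub Actions step: bound the fuzz target by iteration count
# instead of wall time. -fuzz selects one target per package.
- name: Fuzz epoching keeper
  run: go test ./x/epoching/keeper -run='^$' -fuzz=FuzzSlashedValSet -fuzztime=100x
```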
Yes, the QuickCheck family runs 100 tests by default, and it's up to you to tweak the numbers. I also want tests to finish in a reasonable time, 10 seconds at most, so if I see that they take longer I have to rein in how much data I generate.
Maybe I don't need 1000 blocks with up to 100 transactions each just to test something simple, and maybe I let my tree generation do too much branching, so that my tree of depth 10 is exponentially wide. Generating random bytes can also get out of hand, like opaque payloads: we don't need them to be hundreds of kilobytes long.
So it's a good indication of where time is not spent well.
And you are right: I'd rather know that a particular test passed 100 runs than that some VM spent one minute overall on all fuzz tests and may or may not have covered them equally.
require.Equal(t, sdk.Uint64ToBigEndian(uint64(i)), msg.MsgId)
require.Nil(t, msg.Msg)
It may be worth creating a constructor method like NewQueuedMessage(txid, msg) and letting it calculate the hash itself, so we're not able to generate invalid ones. Just an idea.
ctx = nextBlock(app, ctx)
}

// check whether the validator set remains the same or not
Okay, but why would it change? There are no transactions, and even if there were, the test doesn't say how long an epoch is.
Shouldn't the test generate random registrations and check that the validator set doesn't change during an epoch, but does change at the end?
That's a good idea! However, this requires mocking MsgCreateValidator, or even MsgWrappedCreateValidator, which includes the BLS stuff. I have added a TODO on this and will do it in subsequent PRs.
Nice, you found a lot of good techniques for generating valid-looking blocks!
Partially fixes BM-63
This PR introduces more test infra for the epoching module and adds some (stateful) fuzz tests on keeper functionalities. Specifically, it introduces the following:
- a helper for generating a BabylonApp with a number of validators
- stateful fuzz tests in epoch_msg_queue.go, epoch_val_set.go, and epochs.go

In addition, this PR removes some redundant unit tests (which can be safely replaced by the more powerful fuzzing ones) and fixes two bugs in the epoching module's BeginBlock and in the order of initialising hooks. The bugs were revealed by the fuzz tests.

Some stateful fuzz tests are left to future PRs.

One of my feelings is that some of this test infra might be useful for other modules as well, and could thus become part of the project-level SimApp in the future. This depends on the team's decision on the integration-test approach.